{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6FvzT_HQXz9J"
      },
      "source": [
        "# Verifying workable TF Serving\n",
        "\n",
        "This tutorial shows:\n",
        "- how to run TF Serving for a custom model in Docker container\n",
        "- how to request for predictions via both gRPC and RestAPI calls\n",
        "- the prediction timing result from TF Serving\n",
        "\n",
        "This notebook is written by referencing the [official TF Serving gRPC example](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/resnet_k8s.yaml) and [official TF Serving RestAPI example](https://www.tensorflow.org/tfx/tutorials/serving/rest_simple)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Com8Mcu2Xz9L"
      },
      "source": [
        "### Imports"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "b-aGIWy8c2Ht"
      },
      "outputs": [],
      "source": [
        "!pip install -q requests\n",
        "!pip install -q tensorflow-serving-api"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "6lQVylcMXz9N"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import tempfile\n",
        "import pandas as pd\n",
        "import tensorflow as tf\n",
        "import numpy as np\n",
        "import json\n",
        "import requests\n",
        "\n",
        "# gRPC request specific imports\n",
        "import grpc\n",
        "from tensorflow_serving.apis import predict_pb2\n",
        "from tensorflow_serving.apis import prediction_service_pb2_grpc"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GoIj2728pLyw"
      },
      "source": [
        "## Model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3xmYCIWpXz9N"
      },
      "source": [
        "### Get a sample model \n",
        "\n",
        "The target model is the plain `ResNet50` trained on ImageNet."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "VysQtJQnXz9O",
        "outputId": "c63abf81-65e7-48c9-9e71-d16108da2d2a"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels.h5\n",
            "102973440/102967424 [==============================] - 2s 0us/step\n",
            "102981632/102967424 [==============================] - 2s 0us/step\n"
          ]
        }
      ],
      "source": [
        "core = tf.keras.applications.ResNet50(include_top=True, input_shape=(224, 224, 3))\n",
        "\n",
        "inputs = tf.keras.layers.Input(shape=(224, 224, 3), name=\"image_input\")\n",
        "preprocess = tf.keras.applications.resnet50.preprocess_input(inputs)\n",
        "outputs = core(preprocess, training=False)\n",
        "model = tf.keras.Model(inputs=[inputs], outputs=[outputs])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "p3bC--0GXz9O"
      },
      "source": [
        "### Save the model\n",
        "\n",
        "Below code saves the model under `MODEL_DIR`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "z9AmyovhXz9O",
        "outputId": "c26eadf0-e06e-45e4-a40d-e00b5343154e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "export_path = /tmp/1\n",
            "\n",
            "WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.\n",
            "INFO:tensorflow:Assets written to: /tmp/1/assets\n",
            "\n",
            "Saved model:\n",
            "total 4040\n",
            "drwxr-xr-x 2 root root    4096 Mar 23 07:32 assets\n",
            "-rw-r--r-- 1 root root  557217 Mar 23 07:32 keras_metadata.pb\n",
            "-rw-r--r-- 1 root root 3565545 Mar 23 07:32 saved_model.pb\n",
            "drwxr-xr-x 2 root root    4096 Mar 23 07:32 variables\n"
          ]
        }
      ],
      "source": [
        "MODEL_DIR = tempfile.gettempdir()\n",
        "version = 1\n",
        "export_path = os.path.join(MODEL_DIR, str(version))\n",
        "print('export_path = {}\\n'.format(export_path))\n",
        "\n",
        "tf.keras.models.save_model(\n",
        "    model,\n",
        "    export_path,\n",
        "    overwrite=True,\n",
        "    include_optimizer=True,\n",
        "    save_format=None,\n",
        "    signatures=None,\n",
        "    options=None\n",
        ")\n",
        "\n",
        "print('\\nSaved model:')\n",
        "!ls -l {export_path}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VV7onOD2Xz9P"
      },
      "source": [
        "### Examine your saved model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "baanYnt8ohM7"
      },
      "source": [
        "TensorFlow comes with a handy `saved_model_cli` tool to investigate saved model.\n",
        "\n",
        "Notice from `signature_def['serving_default']:` \n",
        "- the input name is `image_input`\n",
        "- the output name is `resnet50`\n",
        "\n",
        "You need to know these to make requests to the TF Serving server later"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Lgzz06XoXz9Q",
        "outputId": "c51a85f9-c6bf-4e7e-f710-2a572fde45d6"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\n",
            "MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:\n",
            "\n",
            "signature_def['__saved_model_init_op']:\n",
            "  The given SavedModel SignatureDef contains the following input(s):\n",
            "  The given SavedModel SignatureDef contains the following output(s):\n",
            "    outputs['__saved_model_init_op'] tensor_info:\n",
            "        dtype: DT_INVALID\n",
            "        shape: unknown_rank\n",
            "        name: NoOp\n",
            "  Method name is: \n",
            "\n",
            "signature_def['serving_default']:\n",
            "  The given SavedModel SignatureDef contains the following input(s):\n",
            "    inputs['image_input'] tensor_info:\n",
            "        dtype: DT_FLOAT\n",
            "        shape: (-1, 224, 224, 3)\n",
            "        name: serving_default_image_input:0\n",
            "  The given SavedModel SignatureDef contains the following output(s):\n",
            "    outputs['resnet50'] tensor_info:\n",
            "        dtype: DT_FLOAT\n",
            "        shape: (-1, 1000)\n",
            "        name: StatefulPartitionedCall:0\n",
            "  Method name is: tensorflow/serving/predict\n",
            "\n",
            "Concrete Functions:\n",
            "  Function Name: '__call__'\n",
            "    Option #1\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          inputs: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='inputs')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: False\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n",
            "    Option #2\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          image_input: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='image_input')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: False\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n",
            "    Option #3\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          inputs: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='inputs')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: True\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n",
            "    Option #4\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          image_input: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='image_input')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: True\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n",
            "\n",
            "  Function Name: '_default_save_signature'\n",
            "    Option #1\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          image_input: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='image_input')\n",
            "\n",
            "  Function Name: 'call_and_return_all_conditional_losses'\n",
            "    Option #1\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          image_input: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='image_input')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: True\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n",
            "    Option #2\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          image_input: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='image_input')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: False\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n",
            "    Option #3\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          inputs: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='inputs')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: True\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n",
            "    Option #4\n",
            "      Callable with:\n",
            "        Argument #1\n",
            "          inputs: TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='inputs')\n",
            "        Argument #2\n",
            "          DType: bool\n",
            "          Value: False\n",
            "        Argument #3\n",
            "          DType: NoneType\n",
            "          Value: None\n"
          ]
        }
      ],
      "source": [
        "!saved_model_cli show --dir {export_path} --all"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5NBTdC7jXz9Q"
      },
      "source": [
        "## TF Serving"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "g4G_oWb_plDI"
      },
      "source": [
        "### Create dummy data\n",
        "\n",
        "The dummy data is nothing but just contains random numbers in the batch size of 32."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "loMcfbfTXz9S",
        "outputId": "0b18aed6-af3b-4e06-b5ae-e336305b0d5e"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "TensorShape([32, 224, 224, 3])"
            ]
          },
          "execution_count": 6,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dummy_inputs = tf.random.normal((32, 224, 224, 3))\n",
        "dummy_inputs.shape"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZBxKquAwpCEh"
      },
      "source": [
        "### Install TF Serving tool"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mDVz8VnnXz9Q"
      },
      "outputs": [],
      "source": [
        "!echo \"deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \\\n",
        "curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -\n",
        "!sudo apt update"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "u2HQb4q6sonS"
      },
      "outputs": [],
      "source": [
        "!sudo apt-get install tensorflow-model-server"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jn9oNYM7pYcv"
      },
      "source": [
        "### Run TF Serving server"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "metadata": {
        "id": "MH-RScSvXz9R"
      },
      "outputs": [],
      "source": [
        "os.environ[\"MODEL_DIR\"] = MODEL_DIR"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "`saved_model_cli` CLI accepts a set of options.\n",
        "- `--rest_api_port` exposes additional port for RestAPI. By default `8500` is exposed as gRPC.\n",
        "- `--model_name` lets TF Serving to identify which model to access. You can visually see this in the RestAPI's URI.\n",
        "- `--enable_model_warmup` \n",
        "  - The TensorFlow runtime has components that are lazily initialized, which can cause high latency for the first request/s sent to a model after it is loaded. To reduce the impact of lazy initialization on request latency, it's possible to trigger the initialization of the sub-systems and components at model load time by providing a sample set of inference requests along with the SavedModel. This process is known as \"warming up\" the model.\n",
        "  - To trigger warmup of the model at load time, attach a warmup data file under the assets.extra subfolder of the SavedModel directory.\n",
        "  - `--enable_model_warmup` option triggers this process.\n",
        "  - for further information, please look at the [official document](https://www.tensorflow.org/tfx/serving/saved_model_warmup?hl=en)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Mq4t5ozVXz9R"
      },
      "outputs": [],
      "source": [
        "!nohup tensorflow_model_server \\\n",
        "  --rest_api_port=8501 \\\n",
        "  --model_name=resnet_model \\\n",
        "  --model_base_path=$MODEL_DIR >server.log 2>&1 &\n",
        "\n",
        "# --enable_model_warmup for warmup(https://www.tensorflow.org/tfx/serving/saved_model_warmup)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 28,
      "metadata": {
        "id": "PVhTO53jXz9S"
      },
      "outputs": [],
      "source": [
        "!cat server.log"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ea6V73oXzs3U"
      },
      "source": [
        "Notice that two ports are exposed for listening both RestAPI(`8501`) and gRPC(`8500`)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "KumZ3xB4giEa",
        "outputId": "4302ee0a-994f-485a-c99b-4fa394068d64"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "node         7 root   21u  IPv6  25789      0t0  TCP *:8080 (LISTEN)\n",
            "colab-fil   30 root    5u  IPv4  26644      0t0  TCP *:3453 (LISTEN)\n",
            "colab-fil   30 root    6u  IPv6  26645      0t0  TCP *:3453 (LISTEN)\n",
            "jupyter-n   43 root    6u  IPv4  25864      0t0  TCP 172.28.0.2:9000 (LISTEN)\n",
            "python3     61 root   15u  IPv4  27814      0t0  TCP 127.0.0.1:50215 (LISTEN)\n",
            "python3     61 root   18u  IPv4  27818      0t0  TCP 127.0.0.1:54779 (LISTEN)\n",
            "python3     61 root   21u  IPv4  27822      0t0  TCP 127.0.0.1:40395 (LISTEN)\n",
            "python3     61 root   24u  IPv4  27826      0t0  TCP 127.0.0.1:60517 (LISTEN)\n",
            "python3     61 root   30u  IPv4  27832      0t0  TCP 127.0.0.1:40255 (LISTEN)\n",
            "python3     61 root   43u  IPv4  28831      0t0  TCP 127.0.0.1:53235 (LISTEN)\n",
            "python3     81 root    3u  IPv4  29267      0t0  TCP 127.0.0.1:15144 (LISTEN)\n",
            "python3     81 root    5u  IPv4  28223      0t0  TCP 127.0.0.1:42197 (LISTEN)\n",
            "python3     81 root    9u  IPv4  28356      0t0  TCP 127.0.0.1:41627 (LISTEN)\n",
            "tensorflo 5933 root    5u  IPv4  66554      0t0  TCP *:8500 (LISTEN)\n",
            "tensorflo 5933 root   12u  IPv4  66559      0t0  TCP *:8501 (LISTEN)\n"
          ]
        }
      ],
      "source": [
        "!sudo lsof -i -P -n | grep LISTEN"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mvHRnTmppqn9"
      },
      "source": [
        "## RestAPI request"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1QNMoU3qq2fN"
      },
      "source": [
        "### Convert dummy data in JSON format"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "xDw6gT7lXz9S",
        "outputId": "19e82468-1441-4dae-c514-9d8b78f53240"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Data: {\"signature_name\": \"serving_default\", \"instances\": ... 442383, 0.8007770776748657, -0.7472004890441895]]]]}\n"
          ]
        }
      ],
      "source": [
        "data = json.dumps({\"signature_name\": \"serving_default\", \"instances\": dummy_inputs.numpy().tolist()})\n",
        "print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hzlArynZq-dF"
      },
      "source": [
        "### Make a request"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 31,
      "metadata": {
        "id": "hh6vmxqnXz9T"
      },
      "outputs": [],
      "source": [
        "headers = {\"content-type\": \"application/json\"}"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 32,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "fS_DI5QpdZTg",
        "outputId": "6b93ca46-1047-4b76-bd84-d21f95f9f4e9"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "1 loop, best of 5: 4.11 s per loop\n"
          ]
        }
      ],
      "source": [
        "%%timeit\n",
        "json_response = requests.post('http://localhost:8501/v1/models/resnet_model:predict', \n",
        "                              data=data, headers=headers)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_n8urXddrJp0"
      },
      "source": [
        "### Interpret the output"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 36,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "brI4TCETXz9T",
        "outputId": "7a098027-fb8a-4bfd-96dc-0bf74d26af09"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Prediction class: [664 664 664 664 664 664 664 664 664 664 664 664 664 664 664 664 664 664\n",
            " 664 664 664 664 664 664 664 851 664 664 851 664 664 664]\n"
          ]
        }
      ],
      "source": [
        "json_response = requests.post('http://localhost:8501/v1/models/resnet_model:predict', \n",
        "                              data=data, headers=headers)\n",
        "rest_predictions = json.loads(json_response.text)['predictions']\n",
        "print('Prediction class: {}'.format(np.argmax(rest_predictions, axis=-1)))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lwuaqGVud58k"
      },
      "source": [
        "## gRPC request"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yr7vO8BQrP2S"
      },
      "source": [
        "### Open up gRPC channel"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 37,
      "metadata": {
        "id": "Y1cxieBDfyjK"
      },
      "outputs": [],
      "source": [
        "channel = grpc.insecure_channel('localhost:8500')\n",
        "stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fS5b0VVfrTmF"
      },
      "source": [
        "### Prepare a request"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 38,
      "metadata": {
        "id": "2QD8xK47emy5"
      },
      "outputs": [],
      "source": [
        "request = predict_pb2.PredictRequest()\n",
        "request.model_spec.name = 'resnet_model'\n",
        "request.model_spec.signature_name = 'serving_default'\n",
        "request.inputs['image_input'].CopyFrom(\n",
        "    tf.make_tensor_proto(dummy_inputs)) #, shape=[32,224,224,3]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7UztXjZGrYTf"
      },
      "source": [
        "### Make a request"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 39,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "wvslcT5_f4P6",
        "outputId": "e4a1008a-dad5-4f8f-a0f6-f9b68aa7db67"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "1 loop, best of 5: 3.63 s per loop\n"
          ]
        }
      ],
      "source": [
        "%%timeit\n",
        "result = stub.Predict(request, 10.0)  # 10 secs timeout"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o52MnidprdCY"
      },
      "source": [
        "### Interpret the output"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 40,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "TBfd4TG0f5z6",
        "outputId": "ffc40a16-787e-4146-9c1e-27317d10867f"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Prediction class: [664 664 664 664 664 664 664 664 664 664 664 664 664 664 664 664 664 664\n",
            " 664 664 664 664 664 664 664 851 664 664 851 664 664 664]\n"
          ]
        }
      ],
      "source": [
        "grpc_predictions = stub.Predict(request, 10.0)  # 10 secs timeout\n",
        "grpc_predictions = grpc_predictions.outputs['resnet50'].float_val\n",
        "grpc_predictions = np.array(grpc_predictions).reshape(32, -1)\n",
        "print('Prediction class: {}'.format(np.argmax(grpc_predictions, axis=-1)))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_dVHrF1ksyAc"
      },
      "source": [
        "## Compare the two results if they are identical\n",
        "\n",
        "`np.testing.assert_allclose` raises exception when the given two arrays do not match exactly."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 41,
      "metadata": {
        "id": "UA4iKEcpioc8"
      },
      "outputs": [],
      "source": [
        "np.testing.assert_allclose(rest_predictions, grpc_predictions, atol=1e-4)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9-UcGFM0z65y"
      },
      "source": [
        "## Conclusion\n",
        "\n",
        "gRPC call took about 3.64 seconds while RestAPI call took about 4.11 seconds on the data of the batch size of 32. This let use conclude that gRPC call is much faster than RestAPI. \n",
        "\n",
        "Also note that this is very close performance comparing to the ONNX inference without any Server framework involved. That means we can expect TF Serving with gRPC should be faster than ONNX hosted on FastAPI server framework since FastAPI is a python framework while TF Serving is C++ implementation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DbLQp2Do0k6H"
      },
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "name": "TF_Serving.ipynb",
      "provenance": [],
      "toc_visible": true
    },
    "interpreter": {
      "hash": "626869861cd3ed4fdbaf755d0ab61c53ee2a93056f2b69c4f7170d3cc24dc5ea"
    },
    "kernelspec": {
      "display_name": "Python 3.8.12 ('.venv': venv)",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.12"
    },
    "orig_nbformat": 4
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
